Problem Note 63692: Incorrect tables and graphs are displayed in the PROC PRINCOMP "Analyzing Job Ratings of Police Officers" documentation example

Example 3, "Analyzing Job Ratings of Police Officers", in the PROC PRINCOMP documentation has incorrect tables and graphs. The double trailing at sign (@@) line-hold specifier is missing from the INPUT statement, so only the first 14 values on each of the 37 data lines are read and the Jobratings data set is incomplete. The printed tables and graphs are therefore based on a 37-observation subset of the intended data set.
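
The effect of the missing @@ can be seen on a much smaller case. The following sketch uses a made-up data line and the hypothetical data set names NoHold and Hold; it is only an illustration of the line-hold behavior, not part of the documentation example.

```sas
/* Hypothetical toy data: two variables, six values on one data line. */

/* Without @@: the data line is released at the end of each DATA step
   iteration, so only the first pair (1 2) is read.                   */
data NoHold;
   input x y;
   datalines;
1 2 3 4 5 6
;

/* With @@: the line is held across iterations, so the same line
   yields three observations: (1 2), (3 4), and (5 6).               */
data Hold;
   input x y @@;
   datalines;
1 2 3 4 5 6
;

proc print data=NoHold; run;
proc print data=Hold;   run;
```

In the same way, the 37 data lines of the original example produced only 37 observations, one per line.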

Here is the corrected example. The output is based on all 103 observations in the Jobratings data set.

Example 3 Analyzing Job Ratings of Police Officers

This example uses the PRINCOMP procedure to analyze job performance. Police officers were rated by their supervisors in 14 categories as part of standard police department administrative procedure.

The following statements create the Jobratings data set:

options validvarname=any;
data Jobratings;
   input 'Communication Skills'n         'Problem Solving'n
         'Learning Ability'n             'Judgment Under Pressure'n
         'Observational Skills'n         'Willingness to Confront Problems'n
         'Interest in People'n           'Interpersonal Sensitivity'n
         'Desire for Self-Improvement'n  'Appearance'n
         'Dependability'n                'Physical Ability'n
         'Integrity'n                    'Overall Rating'n @@;
   datalines;
2 6 8 3 8 8 5 3 8 7 9 8 6 7 7 4 7 5 8 8 7 6 8 5 7 6 6 7 5 6 7 5 7 8 6 3 7 7 5
8 7 5 6 7 8 6 9 7 7 7 9 8 8 9 9 7 9 9 9 9 7 7 9 8 8 7 8 8 8 8 8 9 8 9 7 8 9 9
8 8 8 7 9 9 8 9 9 9 9 8 8 9 8 9 9 7 9 8 8 7 7 9 4 7 9 8 4 6 8 8 8 6 3 5 6 5 2
3 3 5 1 4 3 1 1 3 8 9 8 8 8 8 7 9 5 7 6 8 6 7 7 6 5 5 7 8 9 9 4 4 6 3 9 7 9 7
8 8 9 9 9 8 8 9 8 9 8 9 7 6 7 6 6 6 7 7 5 9 8 8 8 8 7 7 6 6 7 6 7 6 7 7 9 6 7
7 6 3 8 3 9 9 3 2 5 8 8 8 5 6 2 5 7 3 8 8 1 1 2 8 4 9 1 5 8 8 8 7 9 9 6 6 7 9
7 9 8 8 8 7 9 7 9 8 7 7 9 5 9 6 7 9 8 7 9 8 9 9 7 5 8 7 8 7 9 8 9 9 8 8 9 9 8
8 8 9 8 8 8 8 7 8 8 7 6 7 6 5 6 8 7 6 7 7 8 8 8 8 9 8 8 8 8 9 9 8 9 9 8 8 8 8
9 9 8 8 8 7 8 9 8 8 6 7 6 4 6 5 7 7 3 8 4 7 7 6 7 8 7 7 8 7 8 8 7 9 9 9 9 7 7
6 8 8 8 8 6 6 7 6 8 6 6 7 6 7 6 7 8 6 6 5 7 4 6 7 7 6 3 3 4 2 4 4 7 6 6 6 4 8
5 5 6 5 6 5 6 7 6 5 7 8 5 7 6 6 5 4 5 6 6 6 7 6 5 6 5 8 6 6 5 6 6 5 5 5 6 6 6
5 6 7 7 5 8 8 8 8 9 9 8 8 8 6 8 8 8 7 8 9 8 9 9 9 9 9 8 9 8 7 9 9 9 8 8 8 9 9
9 9 8 9 9 8 9 9 5 7 5 5 4 7 7 6 4 6 8 8 7 8 5 3 6 8 7 7 7 7 7 9 7 8 8 7 6 8 6
6 6 7 1 6 4 7 5 7 6 7 7 8 7 7 8 8 8 9 7 9 8 9 9 7 6 7 3 6 4 7 6 7 5 6 5 8 4 6
7 7 6 7 8 8 6 5 8 8 6 7 6 7 6 8 6 9 8 9 5 5 6 6 9 9 9 8 5 5 5 4 6 8 6 6 6 6 3
8 8 6 6 8 8 8 8 9 9 9 9 9 8 9 8 9 9 7 7 8 7 8 8 8 7 9 8 9 9 9 7 6 6 7 7 8 9 9
7 9 9 9 9 7 4 4 7 5 4 6 8 7 8 7 7 7 8 7 7 7 8 7 6 6 7 8 7 9 8 8 8 8 7 6 6 6 8
7 7 8 7 9 9 7 9 7 5 7 6 5 3 6 3 4 3 6 1 5 4 3 7 6 7 7 7 7 4 5 6 5 3 6 5 6 7 6
7 6 6 6 6 5 6 5 6 6 7 6 8 8 8 8 8 8 8 8 8 7 8 7 8 9 8 8 9 7 7 8 8 8 8 6 9 7 7
8 5 8 8 9 4 8 8 8 7 4 7 8 8 6 5 8 6 7 4 5 6 5 4 7 3 6 7 6 7 6 7 7 7 7 6 7 7 7
7 7 7 7 7 7 7 8 8 8 7 8 7 8 9 7 9 8 9 8 9 8 9 9 8 7 9 9 9 8 6 8 6 6 7 2 9 9 1
1 4 7 4 7 1 3 9 8 8 8 9 9 7 6 9 9 9 9 8 8 8 8 7 8 6 8 5 6 6 6 7 7 4 8 7 7 8 6
8 8 8 7 8 9 7 8 8 9 9 9 9 9 9 9 8 6 9 9 9 9 9 9 4 6 6 8 8 5 8 7 6 1 6 8 8 6 6
6 7 5 5 7 7 8 4 8 6 7 7 6 8 7 7 7 7 7 8 8 8 8 9 7 9 7 6 5 6 6 6 6 5 6 5 4 5 9
7 6 7 3 5 7 4 4 8 8 8 8 7 6 8 7 7 4 7 5 5 5 5 6 5 8 6 5 9 6 7 6 6 7 7 7 7 8 7
8 9 7 9 7 8 7 8 7 8 7 4 6 7 7 7 6 6 7 8 6 7 7 6 9 5 5 8 7 4 8 7 7 7 7 8 8 8 7
6 7 7 7 8 6 7 8 6 5 7 7 8 7 8 7 7 7 8 9 9 7 5 8 7 8 6 8 8 7 7 8 7 9 8 7 6 5 7
8 7 7 6 6 6 7 6 7 7 8 8 6 7 7 7 8 7 5 4 6 8 7 7 7 6 7 7 8 8 8 7 7 7 5 7 7 7 7
7 7 7 7 8 9 6 7 8 5 5 8 6 7 6 7 8 8 7 8 7 6 7 6 7 7 7 7 2 4 7 8 6 5 8 5 5 3 5
8 6 6 4 6 5 3 2 3 4 3 5 4 2 5 3 3 3 5 5 6 6 7 6 6 6 7 6 7 8 4 1 1 2 3 1 2 1 4
2 1 1 2 1 1 7 6 8 8 6 5 8 8 5 3 6 8 8 7 5 7 7 8 4 7 8 8 6 8 8 5 8 9 5 6 6 6 7
7 6 6 4 6 5 6 6 6 6 6 7 8 7 7 7 8 7 7 8 8 9 8 7 7 6 8 7 9 9 8 8 7 7 9 9 7 7 6
6 6 8 8 8 8 5 4 6 6 7 6 6 6 4 7 7 9 8 7 5 8 9 9 9 8 8 6 7 8 8 9 7 6 8 8 4 5 9
7 7 7 8 6 8 7 6 5 7 8 5 4 7 7 9 9 9 8 8 8 8 8 9 8 7 8 8 8 6 5 9 4 8 9 3 3 8 8
6 4 5 7 9 9 9 9 9 8 7 7 9 8 8 8 9 8 9 6 6 3 6 7 3 6 8 7 6 5 8 7 9 8 6 7 6 8 8
7 7 9 8 9 6 8 8 7 8 7 8 8 7 7 8 9 8 9 7 9 8 8 8 9 7 8 8 8 8 8 8 7 8 8 9 9 9 9
7 8 9 9 7 9 9 7 9 9 9 9 8 9 9 8 9 9 8 9 9 8 9 9 7 6 6 5 6 3 9 9 5 6 7 4 8 6
;

The Jobratings data set contains 14 variables. Each variable contains ratings on a scale from 1 to 10 (1 = fail to comply, 10 = exceptional). The last variable, Overall Rating, contains an overall index of how each officer performs.
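
Because the original problem was an incomplete data set, it can be worth confirming the size of Jobratings before you run the analysis. The following is a minimal sketch of one way to do that; according to this note, the corrected data set should have 103 observations and 14 variables.

```sas
/* List the variables in creation order and report the observation count. */
proc contents data=Jobratings varnum;
run;

/* Alternatively, write only the observation count to the log.
   NOBS= is set at compile time, so no observations are actually read. */
data _null_;
   if 0 then set Jobratings nobs=n;
   put 'Observations in Jobratings: ' n;
   stop;
run;
```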

The following statements request a principal component analysis of the Jobratings data set and produce default plots. The variable Overall Rating is excluded from the analysis. The statements as shown do not save the component scores; adding the OUT=Scores option to the PROC PRINCOMP statement would write them to a data set named Scores.

ods graphics on;

proc princomp data=Jobratings(drop='Overall Rating'n);
run;

Output 3.1 and Output 3.2 display the PROC PRINCOMP output, beginning with simple statistics and then the correlation matrix. By default, PROC PRINCOMP computes principal components from the correlation matrix, so the total variance is equal to the number of variables, 13. In this example, it would also be reasonable to use the COV option, which would cause variables that have a high variance (such as Dependability) to influence the results more than variables that have a low variance (such as Learning Ability). If you used the COV option, scores would be computed from centered rather than standardized variables.
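
If you want to try the covariance-based alternative described above, a sketch along the following lines might help. The PROC MEANS step is an addition for illustration only; it prints the variances so that you can see which variables would dominate when the COV option is used.

```sas
/* Compare the variances that would drive a covariance-based analysis. */
proc means data=Jobratings(drop='Overall Rating'n) n mean std var maxdec=2;
run;

/* COV: compute the principal components from the covariance matrix
   instead of the correlation matrix.                                */
proc princomp data=Jobratings(drop='Overall Rating'n) cov;
run;
```

With the COV option, the total variance is the sum of the variable variances rather than the number of variables.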

Output 3.1: Simple Statistics and Correlation Matrix from Using PROC PRINCOMP

(Tables: Number of Observations and Variables, Simple Statistics, Correlation Matrix)

Output 3.2 displays the eigenvalues. The first principal component accounts for about 50% of the total variance, the second principal component accounts for about 13.6%, and the third principal component accounts for about 7.7%. Note that the eigenvalues sum to the total variance. The eigenvalues indicate that three to five components provide a good summary of the data: three components account for about 71.7% of the total variance, and five components account for about 82.7%. Subsequent components account for less than 5% each.
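
Because the analysis uses the correlation matrix, each proportion is the eigenvalue divided by 13, the number of variables. To check these percentages against your own run, you could capture the eigenvalue table in a data set. The following sketch assumes that the ODS table is named Eigenvalues; EigenTable is a hypothetical data set name.

```sas
/* Save the eigenvalue table (eigenvalue, proportion, and cumulative
   proportion for each component) to a data set.                     */
ods output Eigenvalues=EigenTable;

proc princomp data=Jobratings(drop='Overall Rating'n);
run;

proc print data=EigenTable noobs;
run;
```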

Output 3.2: Eigenvalues and Eigenvectors from Using PROC PRINCOMP

(Tables: Eigenvalues of the Correlation Matrix, Eigenvectors)

PROC PRINCOMP produces the scree plot as shown in Output 3.3 by default when ODS Graphics is enabled. You can obtain more plots by specifying the PLOTS= option in the PROC PRINCOMP statement.

The scree plot on the left shows that the eigenvalue of the first component is approximately 6.5 and that the eigenvalue of the second component drops sharply to under 2.0. The variance explained plot on the right shows that the first four principal components account for nearly 80% of the total variance.

Output 3.3: Scree Plot from Using PROC PRINCOMP

The first component reflects overall performance, because the first eigenvector shows approximately equal loadings on all variables. The second eigenvector has high positive loadings on the variables Observational Skills and Willingness to Confront Problems but even higher negative loadings on the variables Interest in People and Interpersonal Sensitivity. This component seems to reflect the ability to take action, but it also reflects a lack of interpersonal skills. The third eigenvector has a very high positive loading on the variable Physical Ability and high negative loadings on the variables Problem Solving and Learning Ability. This component seems to reflect physical strength, but it also shows poor learning and problem-solving skills.

In short, the three components represent the following:

  • First component: overall performance
  • Second component: smartness, toughness, and introversion
  • Third component: superior strength and average intellect

In addition to the scree plot, PROC PRINCOMP can produce other plots that help you interpret the results. The following statements request plots from the PRINCOMP procedure:

proc princomp data=Jobratings(drop='Overall Rating'n)
              n=5 plots(ncomp=3)=all;
run;
 

The N=5 option sets the number of principal components to five. The option PLOTS(NCOMP=3)=ALL produces all plots but limits to three the number of components that are displayed in the component pattern plots and the component score plots.
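
If you need only some of the plots, the PLOTS= option also accepts a list of specific plot requests. The following is a sketch of one such request, assuming that the ONLY global plot option and the SCREE, PATTERN, and SCORE plot requests are available in your release of SAS/STAT.

```sas
/* Request only the scree plot, the component pattern plots, and the
   component score plots, limited to the first two components.       */
proc princomp data=Jobratings(drop='Overall Rating'n)
              n=5 plots(only ncomp=2)=(scree pattern score);
run;
```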

Output 3.4 shows a matrix plot of component scores for the first five principal components. The histogram of each component is displayed in the diagonal element of the matrix. The histograms indicate that the first principal component is skewed to the left and the second principal component is slightly skewed to the right.

Output 3.4: Matrix Plot of Component Scores


The pairwise component pattern plots are shown in Output 3.5 through Output 3.7. The pattern plots show the following:

  • All variables positively and evenly correlate with the first principal component (Output 3.5 and Output 3.6).
  • The variables Observational Skills and Willingness to Confront Problems correlate highly with the second component, and the variables Interest in People and Interpersonal Sensitivity correlate highly but negatively with the second component (Output 3.5).
  • The variable Physical Ability correlates highly with the third component, and the variables Problem Solving and Learning Ability correlate highly but negatively with the third component (Output 3.6).
  • The variables Observational Skills, Willingness to Confront Problems, Interest in People, and Interpersonal Sensitivity correlate highly (either positively or negatively) with the second component, but all these variables have very low correlations with the third component; the variables Physical Ability and Problem Solving correlate highly (either positively or negatively) with the third component, but both variables have very low correlations with the second component (Output 3.7).
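
The pattern plots display the correlations between the original variables and the components. As a rough numerical cross-check of the points in the list above, you could save the component scores and correlate them with the original variables. The following sketch is an illustration only; the OUT=Scores data set and the use of PROC CORR are additions that are not part of the documentation example.

```sas
/* Save the scores for the first three components (Prin1-Prin3). */
proc princomp data=Jobratings(drop='Overall Rating'n)
              n=3 out=Scores noprint;
run;

/* Correlate each original variable with each component score. */
proc corr data=Scores nosimple noprob;
   var  Prin1-Prin3;
   with 'Communication Skills'n         'Problem Solving'n
        'Learning Ability'n             'Judgment Under Pressure'n
        'Observational Skills'n         'Willingness to Confront Problems'n
        'Interest in People'n           'Interpersonal Sensitivity'n
        'Desire for Self-Improvement'n  'Appearance'n
        'Dependability'n                'Physical Ability'n
        'Integrity'n;
run;
```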

Output 3.5: Pattern Plot of Component 2 by Component 1

Output 3.6: Pattern Plot of Component 3 by Component 1

Output 3.7: Pattern Plot of Component 3 by Component 2


Output 3.8 shows a component pattern profile. As the pattern plots also show, the nearly horizontal profile of the first component indicates that this component correlates about equally with all variables.

Output 3.8: Component Pattern Profile Plot from Using PROC PRINCOMP


Output 3.9 through Output 3.11 display the pairwise component score plots. Observation numbers are used as the plotting symbol.

Output 3.9 shows a scatter plot of the first and second components. Observations 4 and 31 appear to be outliers on the first component, and observations 22 and 30 are potential outliers on the second component.

Output 3.10 shows a scatter plot of the first and third components. Observations 4 and 31 again appear to be outliers on the first component.

Output 3.11 shows a scatter plot of the second and third components. Observations 22 and 30 again stand out as potential outliers on the second component.

Output 3.12 shows a scatter plot of the second and third components, displaying the first component in color. Color interpolation ranges from red (minimum) to blue (middle) to green (maximum).
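
To look at the flagged observations numerically, you could save the component scores and print the rows in question. The following is a minimal sketch; the data set names Scores and Flagged and the variable Obs are hypothetical, and the observation numbers are the ones mentioned in the plot descriptions above.

```sas
/* Save the scores for the first three components (Prin1-Prin3). */
proc princomp data=Jobratings(drop='Overall Rating'n)
              n=3 out=Scores noprint;
run;

/* Keep only the observations that the score plots single out. */
data Flagged;
   set Scores;
   Obs = _n_;                      /* observation number used as plot symbol */
   if Obs in (4, 22, 30, 31);
run;

proc print data=Flagged noobs;
   var Obs Prin1 Prin2 Prin3;
run;
```

Whether these observations should be treated as outliers is a judgment call; the printed scores only show how far they lie from the rest on each component.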

Output 3.9: Component 2 versus Component 1

Output 3.10: Component 3 versus Component 1

Output 3.11: Component 3 versus Component 2

Output 3.12: Component 3 versus Component 2, Painted by Component 1



Operating System and Release Information

Product Family: SAS System
Product: SAS/STAT

System | Product Release Reported | Product Release Fixed* | SAS Release Reported | SAS Release Fixed*
Solaris for x64 | 9.3 | 14.3 | 9.3 TS1M0 | 9.4 TS1M5
Linux for x64 | 9.3 | 14.3 | 9.3 TS1M0 | 9.4 TS1M5
Linux | 9.3 | 14.3 | 9.3 TS1M0 | 9.4 TS1M5
HP-UX IPF | 9.3 | 14.3 | 9.3 TS1M0 | 9.4 TS1M5
64-bit Enabled Solaris | 9.3 | 14.3 | 9.3 TS1M0 | 9.4 TS1M5
64-bit Enabled HP-UX | 9.3 | 14.3 | 9.3 TS1M0 | 9.4 TS1M5
64-bit Enabled AIX | 9.3 | 14.3 | 9.3 TS1M0 | 9.4 TS1M5
Windows Vista for x64 | 9.3 | | 9.3 TS1M0 |
Windows Vista | 9.3 | | 9.3 TS1M0 |
Windows 7 Ultimate x64 | 9.3 | 14.3 | 9.3 TS1M0 | 9.4 TS1M5
Windows 7 Ultimate 32 bit | 9.3 | 14.3 | 9.3 TS1M0 | 9.4 TS1M5
Windows 7 Professional x64 | 9.3 | 14.3 | 9.3 TS1M0 | 9.4 TS1M5
Windows 7 Professional 32 bit | 9.3 | 14.3 | 9.3 TS1M0 | 9.4 TS1M5
Windows 7 Home Premium x64 | 9.3 | 14.3 | 9.3 TS1M0 | 9.4 TS1M5
Windows 7 Home Premium 32 bit | 9.3 | 14.3 | 9.3 TS1M0 | 9.4 TS1M5
Windows 7 Enterprise x64 | 9.3 | 14.3 | 9.3 TS1M0 | 9.4 TS1M5
Windows 7 Enterprise 32 bit | 9.3 | 14.3 | 9.3 TS1M0 | 9.4 TS1M5
Microsoft Windows XP Professional | 9.3 | | 9.3 TS1M0 |
Microsoft Windows Server 2008 for x64 | 9.3 | 14.3 | 9.3 TS1M0 | 9.4 TS1M5
Microsoft Windows Server 2008 R2 | 9.3 | 14.3 | 9.3 TS1M0 | 9.4 TS1M5
Microsoft Windows Server 2008 | 9.3 | 14.3 | 9.3 TS1M0 | 9.4 TS1M5
Microsoft Windows Server 2003 for x64 | 9.3 | | 9.3 TS1M0 |
Microsoft Windows Server 2003 Standard Edition | 9.3 | | 9.3 TS1M0 |
Microsoft Windows Server 2003 Enterprise Edition | 9.3 | | 9.3 TS1M0 |
Microsoft Windows Server 2003 Datacenter Edition | 9.3 | | 9.3 TS1M0 |
Microsoft® Windows® for x64 | 9.3 | 14.3 | 9.3 TS1M0 | 9.4 TS1M5
z/OS | 9.3 | 14.3 | 9.3 TS1M0 | 9.4 TS1M5

* For software releases that are not yet generally available, the Fixed Release is the software release in which the problem is planned to be fixed.